Skip to content

[Codegen] Apply bounds to subgroup_id#23768

Open
krzysz00 wants to merge 2 commits intoiree-org:mainfrom
krzysz00:subgroup-id-bounds
Open

[Codegen] Apply bounds to subgroup_id#23768
krzysz00 wants to merge 2 commits intoiree-org:mainfrom
krzysz00:subgroup-id-bounds

Conversation

@krzysz00
Copy link
Contributor

@krzysz00 krzysz00 commented Mar 12, 2026

Since we're fixing to start using subgroup_id more often with PCF, apply bounds to it based on the subgroup size (or sizes) and the number of threads in the workgroup. Also extend subgroup_size handling to account for known subgroup sizes instead of giving up completely when there isn't a fixed choice made yet.

Also fix up some double-spaces in test attributes.

Copy link
Contributor

@amd-eochoalo amd-eochoalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, but you may want to wait for other reviewers. Just one comment about the for-loop.

Comment on lines +215 to +229
for (std::optional<int64_t> size : workgroupSizes) {
if (size) {
maxTotalThreads *= *size;
// Cap at the hardware thread-per-workgroup limit inside the loop
// to avoid overflow from multiplying per-dimension maximums.
if (target) {
maxTotalThreads =
std::min(maxTotalThreads,
static_cast<int64_t>(
target.getWgp().getMaxThreadCountPerWorkgroup()));
}
} else {
allSizesKnown = false;
break;
}
Copy link
Contributor

@amd-eochoalo amd-eochoalo Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think this is a little bit more difficult to parse than it should be. Could we remove one level of nesting by moving the break earlier and then the else should not be necessary, right? Maybe changing the logic slightly too for calculating the maxTotalThreads when target is known and avoid overflow somehow or adding variables?

// pseudocode

Suggested change
for (std::optional<int64_t> size : workgroupSizes) {
if (size) {
maxTotalThreads *= *size;
// Cap at the hardware thread-per-workgroup limit inside the loop
// to avoid overflow from multiplying per-dimension maximums.
if (target) {
maxTotalThreads =
std::min(maxTotalThreads,
static_cast<int64_t>(
target.getWgp().getMaxThreadCountPerWorkgroup()));
}
} else {
allSizesKnown = false;
break;
}
// if target is available
std::optional<int64_t> maxThreadCountPerWg = static_cast<int64_t> (target.getWgp().getMaxThreadCountPerWorkgroup()));
for (std::optional<int64_t> size : workgroupSizes) {
if (!size) {
allSizesKnown = false;
break;
}
maxTotalThreads *= *size;
// Cap at the hardware thread-per-workgroup limit inside the loop
// to avoid overflow from multiplying per-dimension maximums.
if (maxThreadCountPerWg) {
maxTotalThreads = std::min(maxTotalThreads, maxThreadCountPerWg)
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored all that code - do you like the new version?

krzysz00 and others added 2 commits March 13, 2026 17:35
Since we're fixing to start using subgroup_id more often with PCF,
apply bounds to it based on the subgroup size (or sizes) and the
number of threads in the workgroup. Also extend subgroup_size handling
to account for known subgroup sizes instead of giving up completely
when there isn't a fixed choice made yet.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@krzysz00 krzysz00 force-pushed the subgroup-id-bounds branch from ad04bc3 to 8adfb9f Compare March 13, 2026 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants